Product Quantization, Embedding Compression, Memory Efficiency, Approximate Search
Scaling Laws for LLM Based Data Compression
lesswrong.com · 11h
Trainable Dynamic Mask Sparse Attention
arxiv.org · 14h
Principal Component Analysis (PCA) is the gold standard in dimensionality reduction.
threadreaderapp.com · 3h
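For context on the PCA headline above, here is a minimal sketch of PCA-style dimensionality reduction via SVD; the `pca_reduce` helper and the toy data are illustrative assumptions, not taken from the linked thread.

```python
import numpy as np

def pca_reduce(X: np.ndarray, k: int) -> np.ndarray:
    """Project the rows of X onto the top-k principal components."""
    # Center the data so the components capture variance, not the mean offset.
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered matrix are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# Toy usage: 100 points in 50 dimensions reduced to 5.
X = np.random.randn(100, 50)
Z = pca_reduce(X, k=5)
print(Z.shape)  # (100, 5)
```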
LeetCode #70: Climbing Stairs
anmoltomer.bearblog.dev · 14h
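For reference, LeetCode #70 asks for the number of distinct ways to climb n stairs taking 1 or 2 steps at a time, which follows the Fibonacci recurrence f(n) = f(n-1) + f(n-2). A minimal sketch (the function name is illustrative, not from the linked post):

```python
def climb_stairs(n: int) -> int:
    # a, b hold f(0) = 1 and f(1) = 1; after n iterations a == f(n).
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(climb_stairs(5))  # 8 distinct ways
```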
Welcome GPT OSS, the new open-source model family from OpenAI!
huggingface.co · 18h
Beyond Manually Designed Pruning Policies with Second-Level Performance Prediction: A Pruning Framework for LLMs
arxiv.org · 14h
E-VRAG: Enhancing Long Video Understanding with Resource-Efficient Retrieval Augmented Generation
arxiv.org · 14h
Information Rates of Approximate Message Passing for Bandlimited Direct-Detection Channels
arxiv.org · 14h
Filtering with Self-Attention and Storing with MLP: One-Layer Transformers Can Provably Acquire and Extract Knowledge
arxiv.org · 14h
Kernel-Based Sparse Additive Nonlinear Model Structure Detection through a Linearization Approach
arxiv.org · 14h